Rover Project Test Notebook

This notebook contains the functions from the lesson and provides the scaffolding you need to test out your mapping methods. The steps you need to complete in this notebook for the project are the following:

  • First, run each cell in the notebook and examine the code and its results.
  • Run the simulator in "Training Mode" and record some data. Note: the simulator may crash if you try to record a large (longer than a few minutes) dataset, but you don't need a ton of data, just some example images to work with.
  • Change the data directory path (2 cells below) to the directory where you saved your data.
  • Test out the functions provided on your data
  • Write new functions (or modify existing ones) to report and map out detections of obstacles and rock samples (yellow rocks)
  • Populate the process_image() function with the appropriate steps/functions to go from a raw image to a worldmap.
  • Run the cell that calls process_image() using moviepy functions to create video output
  • Once you have mapping working, move on to modifying perception.py and decision.py to allow your rover to navigate and map in autonomous mode!

Note: If, at any point, you encounter frozen display windows or other confounding issues, you can always start again with a clean slate by going to the "Kernel" menu above and selecting "Restart & Clear Output".

Run the next cell to get code highlighting in the markdown cells.

In [1]:
%%HTML
<style> code {background-color : orange !important;} </style>
In [2]:
%matplotlib inline
#%matplotlib qt # Choose %matplotlib qt to plot to an interactive window (note it may show up behind your browser)
# Make some of the relevant imports
import cv2 # OpenCV for perspective transform
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import scipy.misc # For saving images as needed
import scipy.ndimage as ndimage
import glob  # For reading in a list of images from a folder
import imageio
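# Note: newer imageio releases dropped this helper; if the call below raises an
# error, install the imageio-ffmpeg package (pip install imageio-ffmpeg) instead
imageio.plugins.ffmpeg.download()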

Quick Look at the Data

Some example data is provided in the test_dataset folder. This basic dataset is enough to get you up and running, but if you want to hone your methods more carefully, record some data of your own to sample various scenarios in the simulator.

Next, read in and display a random image from the test_dataset folder.

In [3]:
path = '../test_dataset/IMG/*'
img_list = glob.glob(path)
# Grab a random image and display it
idx = np.random.randint(0, len(img_list))  # upper bound is exclusive, so every image can be picked
image = mpimg.imread(img_list[idx])
plt.imshow(image)
Out[3]:
<matplotlib.image.AxesImage at 0x7f90b90c36d8>

Calibration Data

Read in and display example grid and rock sample calibration images. You'll use the grid for perspective transform and the rock image for creating a new color selection that identifies these samples of interest.

In [4]:
# In the simulator you can toggle on a grid on the ground for calibration
# You can also toggle on the rock samples with the 0 (zero) key.  
# Here's an example of the grid and one of the rocks
example_grid = '../calibration_images/example_grid1.jpg'
example_rock = '../calibration_images/example_rock1.jpg'
grid_img = mpimg.imread(example_grid)
rock_img = mpimg.imread(example_rock)

fig = plt.figure(figsize=(12,3))
plt.subplot(121)
plt.imshow(grid_img)
plt.subplot(122)
plt.imshow(rock_img)
Out[4]:
<matplotlib.image.AxesImage at 0x7f90b37b5a58>

Perspective Transform

Define the perspective transform function from the lesson and test it on an image.
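
To make the destination-point arithmetic in the next cell concrete, here is a minimal sketch that computes the same four corners for a given image size. The helper name `destination_box` is hypothetical, and the 160 x 320 image size is an assumption based on the source points used in this notebook.

```python
import numpy as np

def destination_box(img_shape, dst_size=5, bottom_offset=6):
    """Return the four destination corners for an image of shape (height, width)."""
    h, w = img_shape[:2]
    return np.float32([
        [w / 2 - dst_size, h - bottom_offset],                 # bottom left
        [w / 2 + dst_size, h - bottom_offset],                 # bottom right
        [w / 2 + dst_size, h - 2 * dst_size - bottom_offset],  # top right
        [w / 2 - dst_size, h - 2 * dst_size - bottom_offset],  # top left
    ])

# Assumed camera image size: 160 rows by 320 columns
box = destination_box((160, 320))
print(box)
```

With these defaults the box is 10 pixels wide and 10 pixels tall, which is what makes each 10x10 pixel square in the warped image correspond to 1 square meter.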

In [5]:
# Define a function to perform a perspective transform
# I've used the example grid image above to choose source points for the
# grid cell in front of the rover (each grid cell is 1 square meter in the sim)
def perspect_transform(img, src, dst):
           
    M = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(img, M, (img.shape[1], img.shape[0]))# keep same size as input image
    
    return warped


# Define calibration box in source (actual) and destination (desired) coordinates
# These source and destination points are defined to warp the image
# to a grid where each 10x10 pixel square represents 1 square meter
# The destination box will be 2*dst_size on each side
dst_size = 5 
# Set a bottom offset to account for the fact that the bottom of the image 
# is not the position of the rover but a bit in front of it
# this is just a rough guess, feel free to change it!
bottom_offset = 6
source = np.float32([[14, 140], [301 ,140],[200, 96], [118, 96]])
destination = np.float32([[image.shape[1]/2 - dst_size, image.shape[0] - bottom_offset],
                  [image.shape[1]/2 + dst_size, image.shape[0] - bottom_offset],
                  [image.shape[1]/2 + dst_size, image.shape[0] - 2*dst_size - bottom_offset], 
                  [image.shape[1]/2 - dst_size, image.shape[0] - 2*dst_size - bottom_offset],
                  ])
warped = perspect_transform(grid_img, source, destination)
plt.imshow(warped)
#scipy.misc.imsave('../output/warped_example.jpg', warped)
Out[5]:
<matplotlib.image.AxesImage at 0x7f90b3794f60>
In [6]:
# test pitch correction for perspective transform
lw_grid_1 = '../lw_calibration_images/grid_pitch_0_04.jpg'
lw_grid_2 = '../lw_calibration_images/grid_pitch_1_18.jpg'
lw_grid_1_img = mpimg.imread(lw_grid_1)
lw_grid_2_img = mpimg.imread(lw_grid_2)


# account for different pitches
pitch_lw2 = 1.18
pitch_lw2 = pitch_lw2 if pitch_lw2 < 180 else pitch_lw2 - 360
print(pitch_lw2)
source_corrected = (source-np.array([0.0, 2*pitch_lw2])).astype(np.float32)
warped_lw_1 = perspect_transform(lw_grid_1_img, source, destination)
warped_lw_2 = perspect_transform(lw_grid_2_img, source, destination)
warped_lw_2_alt = perspect_transform(lw_grid_2_img, source_corrected, destination)

fig = plt.figure(figsize=(12,9))
plt.subplot(231)
plt.imshow(lw_grid_1_img)
plt.subplot(232)
plt.imshow(lw_grid_2_img)
plt.subplot(233)
plt.imshow(lw_grid_2_img)
plt.subplot(234)
plt.imshow(warped_lw_1)
plt.subplot(235)
plt.imshow(warped_lw_2)
plt.subplot(236)
plt.imshow(warped_lw_2_alt)
1.18
Out[6]:
<matplotlib.image.AxesImage at 0x7f90b36eb048>
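
The pitch correction in the cell above can be read as two steps: wrap the simulator's pitch reading into a signed angle, then shift the source points vertically in proportion to it. The sketch below isolates those steps. The helper names are hypothetical, and the factor of 2 pixels per degree is simply the empirical value used in this notebook, not a calibrated constant.

```python
import numpy as np

def wrap_angle_deg(angle):
    """Map an angle in [0, 360) degrees to the signed range [-180, 180)."""
    return (np.asarray(angle, dtype=float) + 180.0) % 360.0 - 180.0

def pitch_corrected_source(source, pitch_deg, px_per_deg=2.0):
    """Shift source points vertically by px_per_deg pixels per degree of pitch."""
    shift = np.array([0.0, px_per_deg * wrap_angle_deg(pitch_deg)])
    return (source - shift).astype(np.float32)

source = np.float32([[14, 140], [301, 140], [200, 96], [118, 96]])
# A reading of 358.5 degrees is a pitch of -1.5 degrees,
# so every source point moves 3 pixels down the image
corrected = pitch_corrected_source(source, 358.5)
print(corrected)
```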

Color Thresholding

Define the color thresholding function from the lesson and apply it to the warped image.

TODO: Ultimately, you want your map to not just include navigable terrain but also obstacles and the positions of the rock samples you're searching for. Modify this function or write a new function that returns the pixel locations of obstacles (areas below the threshold) and rock samples (yellow rocks in calibration images), such that you can map these areas into world coordinates as well.
Hints and Suggestions:

  • For obstacles, you can just invert the color selection that you used to detect ground pixels, i.e., if you've decided that everything above the threshold is navigable terrain, then everything below the threshold must be an obstacle!
  • For rocks, think about imposing a lower and upper boundary in your color selection to be more specific about choosing colors. You can investigate the colors of the rocks (the RGB pixel values) in an interactive matplotlib window to get a feel for the appropriate threshold range (keep in mind you may want different ranges for each of R, G and B!). Feel free to get creative and even bring in functions from other libraries. Here's an example of color selection using OpenCV.

  • Beware, however: if you start manipulating images with OpenCV, keep in mind that it defaults to BGR instead of RGB color space when reading/writing images, so things can get confusing.
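
The hints above can be sketched with NumPy alone. This is an illustrative sketch, not the lesson's reference solution: `range_thresh` is a hypothetical helper, the three test pixels are made up, and the rock bounds are the rough values tried later in this notebook, so expect to tune them on your own data.

```python
import numpy as np

def range_thresh(img, lower, upper):
    """Return a 0/1 mask of pixels whose R, G and B all lie within [lower, upper]."""
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    inside = np.all((img >= lower) & (img <= upper), axis=2)
    return inside.astype(np.uint8)

# Tiny 1x3 RGB test image: bright ground, dark obstacle, yellow rock
img = np.array([[[200, 200, 200], [30, 30, 30], [180, 160, 20]]], dtype=np.uint8)

navigable = range_thresh(img, (160, 160, 160), (255, 255, 255))
obstacle = 1 - navigable  # everything that is not navigable terrain
rock = range_thresh(img, (100, 100, 0), (255, 255, 50))

print(navigable, obstacle, rock)
```

Note that a plain inversion also marks the rock pixel as an obstacle, so you may want to handle rocks separately. If an image was loaded through OpenCV in BGR order, reversing the last axis (`img[..., ::-1]`) gives RGB before thresholding.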

In [7]:
# Identify pixels above the threshold
# Threshold of RGB > 160 does a nice job of identifying ground pixels only
def color_thresh(img, rgb_thresh=(160, 160, 160), upper_rgb_thresh=(255,255,255)):
    return cv2.inRange(img, rgb_thresh, upper_rgb_thresh)

#rock_warped = perspect_transform(rock_img, source, destination)
#rock_threshed = color_thresh(rock_warped, (100, 100, 0), (255, 255, 50))

threshed = color_thresh(warped)
threshed = ndimage.binary_closing(threshed).astype(int)
plt.imshow(threshed, cmap='gray')
#scipy.misc.imsave('../output/warped_threshed.jpg', threshed*255)
Out[7]:
<matplotlib.image.AxesImage at 0x7f90b33d4a90>

Coordinate Transformations

Define the functions used to do coordinate transforms and apply them to an image.
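
Before running the next cell, it can help to check the coordinate conventions on numbers small enough to verify by hand. The sketch below mirrors the functions in that cell in condensed form. `to_world` is a hypothetical name that combines rotation, scaling, translation and clipping, and the scale of 10 pixels per meter is an assumption taken from the perspective transform setup.

```python
import numpy as np

def rover_coords(binary_img):
    """Convert nonzero image pixels to rover-centric x (forward) and y (left)."""
    ypos, xpos = binary_img.nonzero()
    x_pixel = -(ypos - binary_img.shape[0]).astype(float)
    y_pixel = -(xpos - binary_img.shape[1] / 2).astype(float)
    return x_pixel, y_pixel

def to_world(x_pix, y_pix, xpos, ypos, yaw_deg, world_size, scale):
    """Rotate by yaw, scale, translate to the rover position and clip to the map."""
    yaw = np.deg2rad(yaw_deg)
    x_rot = x_pix * np.cos(yaw) - y_pix * np.sin(yaw)
    y_rot = x_pix * np.sin(yaw) + y_pix * np.cos(yaw)
    x_world = np.clip(np.int_(x_rot / scale + xpos), 0, world_size - 1)
    y_world = np.clip(np.int_(y_rot / scale + ypos), 0, world_size - 1)
    return x_world, y_world

# A 4x4 mask with one pixel set in row 1, column 3
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1, 3] = 1
x_r, y_r = rover_coords(mask)  # 3 pixels ahead, 1 pixel to the right (y = -1)

# A point 20 pixels straight ahead of a rover at (100, 100)
ahead_x, ahead_y = np.array([20.0]), np.array([0.0])
east = to_world(ahead_x, ahead_y, 100.0, 100.0, 0.0, 200, 10)
north = to_world(ahead_x, ahead_y, 100.0, 100.0, 90.0, 200, 10)
print(x_r, y_r, east, north)
```

With a yaw of 0 the point lands 2 map cells along +x, and with a yaw of 90 degrees it lands 2 cells along +y, which matches a counterclockwise-positive yaw convention.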

In [8]:
# Define a function to convert from image coords to rover coords
def rover_coords(binary_img):
    # Identify nonzero pixels
    ypos, xpos = binary_img.nonzero()
    # Calculate pixel positions with reference to the rover position being at the 
    # center bottom of the image.  
    # (plain float is used because the np.float alias was removed in newer NumPy)
    x_pixel = -(ypos - binary_img.shape[0]).astype(float)
    y_pixel = -(xpos - binary_img.shape[1]/2 ).astype(float)
    return x_pixel, y_pixel

# Define a function to convert to radial coords in rover space
def to_polar_coords(x_pixel, y_pixel):
    # Convert (x_pixel, y_pixel) to (distance, angle) 
    # in polar coordinates in rover space
    # Calculate distance to each pixel
    dist = np.sqrt(x_pixel**2 + y_pixel**2)
    # Calculate angle away from vertical for each pixel
    angles = np.arctan2(y_pixel, x_pixel)
    return dist, angles

# Define a function to map rover space pixels to world space
def rotate_pix(xpix, ypix, yaw):
    # Convert yaw to radians
    yaw_rad = yaw * np.pi / 180
    xpix_rotated = (xpix * np.cos(yaw_rad)) - (ypix * np.sin(yaw_rad))
                            
    ypix_rotated = (xpix * np.sin(yaw_rad)) + (ypix * np.cos(yaw_rad))
    # Return the result  
    return xpix_rotated, ypix_rotated

def translate_pix(xpix_rot, ypix_rot, xpos, ypos, scale): 
    # Apply a scaling and a translation
    xpix_translated = (xpix_rot / scale) + xpos
    ypix_translated = (ypix_rot / scale) + ypos
    # Return the result  
    return xpix_translated, ypix_translated


# Define a function to apply rotation and translation (and clipping)
# Once you define the two functions above this function should work
def pix_to_world(xpix, ypix, xpos, ypos, yaw, world_size, scale):
    # Apply rotation
    xpix_rot, ypix_rot = rotate_pix(xpix, ypix, yaw)
    # Apply translation
    xpix_tran, ypix_tran = translate_pix(xpix_rot, ypix_rot, xpos, ypos, scale)
    # Perform rotation, translation and clipping all at once
    x_pix_world = np.clip(np.int_(xpix_tran), 0, world_size - 1)
    y_pix_world = np.clip(np.int_(ypix_tran), 0, world_size - 1)
    # Return the result
    return x_pix_world, y_pix_world

# Grab another random image
idx = np.random.randint(0, len(img_list))  # upper bound is exclusive
image = mpimg.imread(img_list[idx])
warped = perspect_transform(image, source, destination)
threshed = color_thresh(warped)

# Calculate pixel values in rover-centric coords and distance/angle to all pixels
xpix, ypix = rover_coords(threshed)
dist, angles = to_polar_coords(xpix, ypix)
mean_dir = np.mean(angles)

# Do some plotting
fig = plt.figure(figsize=(12,9))
plt.subplot(221)
plt.imshow(image)
plt.subplot(222)
plt.imshow(warped)
plt.subplot(223)
plt.imshow(threshed, cmap='gray')
plt.subplot(224)
plt.plot(xpix, ypix, '.')
plt.ylim(-160, 160)
plt.xlim(0, 160)
arrow_length = 100
x_arrow = arrow_length * np.cos(mean_dir)
y_arrow = arrow_length * np.sin(mean_dir)
plt.arrow(0, 0, x_arrow, y_arrow, color='red', zorder=2, head_width=10, width=2)
Out[8]:
<matplotlib.patches.FancyArrow at 0x7f90b35cad30>

Read in saved data and ground truth map of the world

The next cell is all set up to read your saved data into a pandas dataframe. Here you'll also read in a "ground truth" map of the world, where white pixels (pixel value = 1) represent navigable terrain.

After that, we'll define a class to store telemetry data and pathnames to images. When you instantiate this class (data = Databucket()) you'll have a global variable called data that you can refer to for telemetry and map data within the process_image() function in the following cell.

In [25]:
# Import pandas and read in csv file as a dataframe
import pandas as pd
# Change the path below to your data directory
# If you are in a locale (e.g., Europe) that uses ',' as the decimal separator
# change the '.' to ','
#df = pd.read_csv('../test_dataset/robot_log.csv', delimiter=';', decimal='.')
df = pd.read_csv('../lw_dataset/robot_log.csv', delimiter=';', decimal='.')
csv_img_list = df["Path"].tolist() # Create list of image pathnames
# Read in ground truth map and create a 3-channel image with it
ground_truth = mpimg.imread('../calibration_images/map_bw.png')
ground_truth_3d = np.dstack((ground_truth*0, ground_truth*255, ground_truth*0)).astype(float)

# Creating a class to be the data container
# Will read in saved data from csv file and populate this object
# Worldmap is instantiated as 200 x 200 grids corresponding 
# to a 200m x 200m space (same size as the ground truth map: 200 x 200 pixels)
# This encompasses the full range of output position values in x and y from the sim
class Databucket():
    def __init__(self):
        self.images = csv_img_list  
        self.xpos = df["X_Position"].values
        self.ypos = df["Y_Position"].values
        self.yaw = df["Yaw"].values
        self.pitch = df["Pitch"].values
        self.roll = df["Roll"].values
        self.count = 0 # This will be a running index
        self.worldmap = np.zeros((200, 200, 3)).astype(float)
        self.ground_truth = ground_truth_3d # Ground truth worldmap
        # Signed evidence grid: positive = navigable, negative = obstacle
        self.occupancy = np.zeros((200, 200), dtype=float)

# Instantiate a Databucket().. this will be a global variable/object
# that you can refer to in the process_image() function below
data = Databucket()
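
As a small illustration of how the single-channel ground truth map becomes a 3-channel image, the sketch below applies the same `np.dstack` construction to a made-up 2x2 map. The values are hypothetical; the real map is 200 x 200.

```python
import numpy as np

# Hypothetical 2x2 ground truth map: 1 = navigable, 0 = not navigable
ground_truth = np.array([[1.0, 0.0],
                         [0.0, 1.0]])

# Put the map in the green channel only, scaled to 0..255
ground_truth_3d = np.dstack((ground_truth * 0,
                             ground_truth * 255,
                             ground_truth * 0)).astype(float)

print(ground_truth_3d.shape)   # (2, 2, 3)
print(ground_truth_3d[0, 0])   # navigable pixel: green only
```

Keeping the ground truth in green leaves the red and blue channels free for obstacles and navigable terrain when the two maps are overlaid.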

Write a function to process stored images

Modify the process_image() function below by adding the perception steps (the functions defined above) to perform image analysis and mapping. The following cell is all set up to use this process_image() function in conjunction with the moviepy video processing package to create a video from the images you saved while recording data in the simulator.

In short, you will be passing individual images into process_image() and building up an image called output_image that will be stored as one frame of video. You can make a mosaic of the various steps of your analysis process and add text as you like (example provided below).

To start with, you can simply run the next three cells to see what happens, but then go ahead and modify them such that the output video demonstrates your mapping process. Feel free to get creative!
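
The mosaic described above amounts to slicing regions of one large array and assigning an image to each. The following sketch shows one possible layout. `make_mosaic` is a hypothetical helper, the inputs are dummy arrays with the sizes assumed in this notebook (160 x 320 camera frames, 200 x 200 worldmap), and the layout only works when the worldmap is at least as tall as the camera image.

```python
import numpy as np

def make_mosaic(img, warped, segmented, worldmap):
    """Tile four images into one frame: camera, warped, flipped map, segmentation."""
    h, w = img.shape[:2]
    map_h, map_w = worldmap.shape[:2]
    out = np.zeros((h + map_h, 2 * w, 3))
    out[0:h, 0:w] = img                           # upper left: camera image
    out[0:h, w:2 * w] = warped                    # upper right: warped view
    out[h:h + map_h, 0:map_w] = np.flipud(worldmap)  # lower left: map, y-axis up
    out[h:2 * h, w:2 * w] = segmented             # lower right: segmentation
    return out

# Dummy inputs filled with constant values so each region is easy to identify
img = np.full((160, 320, 3), 10.0)
warped = np.full((160, 320, 3), 20.0)
segmented = np.full((160, 320, 3), 30.0)
worldmap = np.zeros((200, 200, 3))
worldmap[0, 0, 0] = 255.0  # marks map row 0, which ends up at the bottom after flipud

mosaic = make_mosaic(img, warped, segmented, worldmap)
print(mosaic.shape)
```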

In [26]:
# Define a function to pass stored images to
# reading rover position and yaw angle from csv file
# This function will be used by moviepy to create an output video
def process_image(img):
    # Example of how to use the Databucket() object defined above
    # to print the current x, y and yaw values 
    # print(data.xpos[data.count], data.ypos[data.count], data.yaw[data.count])

    # TODO: 
    # 1) Define source and destination points for perspective transform
    # 2) Apply perspective transform
    # 3) Apply color threshold to identify navigable terrain/obstacles/rock samples
    # 4) Convert thresholded image pixel values to rover-centric coords
    # 5) Convert rover-centric pixel values to world coords
    # 6) Update worldmap (to be displayed on right side of screen)
        # Example: data.worldmap[obstacle_y_world, obstacle_x_world, 0] += 1
        #          data.worldmap[rock_y_world, rock_x_world, 1] += 1
        #          data.worldmap[navigable_y_world, navigable_x_world, 2] += 1

    # 7) Make a mosaic image, below is some example code
        # First create a blank image (can be whatever shape you like)
        
    xpos, ypos = data.xpos[data.count], data.ypos[data.count]
    yaw, pitch, roll = data.yaw[data.count], data.pitch[data.count], data.roll[data.count]
        
    dst_size = 5 
    bottom_offset = 6
    source = np.float32([[14, 140], [301 ,140],[200, 96], [118, 96]])

    # source pitch correction 
    pitch_correction = pitch if pitch < 180 else pitch - 360
    source = (source - np.array([0.0, 2 * pitch_correction])).astype(np.float32)
    destination = np.float32([[img.shape[1]/2 - dst_size, img.shape[0] - bottom_offset],
                  [img.shape[1]/2 + dst_size, img.shape[0] - bottom_offset],
                  [img.shape[1]/2 + dst_size, img.shape[0] - 2*dst_size - bottom_offset], 
                  [img.shape[1]/2 - dst_size, img.shape[0] - 2*dst_size - bottom_offset],
                  ])
    world_scale = 20
    world_size = 200
    # 2) Apply perspective transform
    warped = perspect_transform(img, source, destination)
    
    # 3) Apply color threshold to identify navigable terrain/obstacles/rock samples
    navigable = ndimage.binary_closing(color_thresh(warped)).astype(int)
    obstacle = (1 - navigable) & perspect_transform(np.ones_like(warped[:,:,0]), source, destination)
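    # cv2.inRange returns a uint8 mask of 0/255; convert it to a 0/1 int mask
    # so that rock * 255 below does not overflow
    rock = (color_thresh(warped, (100, 100, 0), (255, 255, 50)) > 0).astype(int)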
    
        
    segmented_image = np.zeros_like(img)
    segmented_image[:,:,0] = obstacle * 255
    segmented_image[:,:,1] = rock * 255 
    segmented_image[:,:,2] = navigable * 255


    # 4) Convert thresholded image pixel values to rover-centric coords
    #    (the conversion to world coords, step 5, follows further down)
    navigable_x, navigable_y = rover_coords(navigable)
    obstacle_x, obstacle_y = rover_coords(obstacle)
    rock_x, rock_y = rover_coords(rock)

    # reduce influence of observations based on longer distance to rover
    obstacle_update =  np.clip(1 - to_polar_coords(obstacle_x, obstacle_y)[0]/100, 0, 1)
    navigable_update = np.clip(1 - to_polar_coords(navigable_x, navigable_y)[0]/227, 0, 1)        

    navigable_x_world, navigable_y_world = pix_to_world(navigable_x, navigable_y, xpos, ypos, yaw, world_size, world_scale)
    obstacle_x_world, obstacle_y_world = pix_to_world(obstacle_x, obstacle_y, xpos, ypos, yaw, world_size, world_scale)    
    rock_x_world, rock_y_world = pix_to_world(rock_x, rock_y, xpos, ypos, yaw, world_size, world_scale)

    
    if (roll < 5 or roll > 355): # and (pitch < 2 or pitch > 358) 
        data.occupancy[obstacle_y_world, obstacle_x_world] -= obstacle_update
        data.occupancy[navigable_y_world, navigable_x_world] += 10*navigable_update
        data.occupancy = np.clip(data.occupancy, -10, 10)
        data.worldmap[:, :, 0] = (data.occupancy < 0) * 255
        data.worldmap[:, :, 2] = (data.occupancy > 0) * 255

        if rock_x_world.size > 0:  # at least one rock pixel was detected
            rock_pos = np.float32([np.mean(rock_x_world), np.mean(rock_y_world)])            
            data.worldmap[int(rock_pos[1]), int(rock_pos[0]), 1] = 255            
    else:
        print('not mapping roll = {}, pitch = {}'.format(roll, pitch))    
        
        
    output_image = np.zeros((img.shape[0] + data.worldmap.shape[0], img.shape[1]*2, 3))
        # Next you can populate regions of the image with various output
        # Here I'm putting the original image in the upper left hand corner
    output_image[0:img.shape[0], 0:img.shape[1]] = img    
    
    output_image[0:img.shape[0], img.shape[1]:] = warped
    
    # Overlay worldmap with ground truth map
    map_add = cv2.addWeighted(data.worldmap, 1, data.ground_truth, 0.5, 0)    
    
    # Flip map overlay so y-axis points upward and add to output_image 
    output_image[img.shape[0]:, 0:data.worldmap.shape[1]] = np.flipud(map_add)
    
    output_image[img.shape[0]:img.shape[0]+img.shape[0], img.shape[1]:img.shape[1]+img.shape[1]] = segmented_image    

        # Then putting some text over the image
    cv2.putText(output_image,"Populate this image with your analyses to make a video!", (20, 20), 
                cv2.FONT_HERSHEY_COMPLEX, 0.4, (255, 255, 255), 1)
    if data.count < len(data.images) - 1:
        data.count += 1 # Keep track of the index in the Databucket()
    
    return output_image
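
The distance weighting used in process_image() can be isolated in a few lines. The sketch below is a variation rather than a copy of the cell above: it uses `np.add.at`, which accumulates contributions when several pixels fall into the same map cell, whereas `+=` with index arrays applies only one update per cell. The function name `update_occupancy` and the toy numbers are hypothetical.

```python
import numpy as np

def update_occupancy(occupancy, x_world, y_world, dist, max_dist, gain):
    """Add distance-weighted evidence to the occupancy grid in place."""
    # Weight falls linearly from 1 at the rover to 0 at max_dist
    weight = np.clip(1.0 - dist / max_dist, 0.0, 1.0)
    # np.add.at sums repeated (y, x) indices instead of keeping only one
    np.add.at(occupancy, (y_world, x_world), gain * weight)
    # Bound the evidence so old observations can still be overturned
    np.clip(occupancy, -10, 10, out=occupancy)
    return occupancy

occ = np.zeros((5, 5))
x_world = np.array([1, 1, 2])         # two pixels share map cell (x=1, y=3)
y_world = np.array([3, 3, 0])
dist = np.array([0.0, 50.0, 100.0])   # distances from the rover in pixels
update_occupancy(occ, x_world, y_world, dist, max_dist=100.0, gain=1.0)
print(occ[3, 1], occ[0, 2])
```

A positive gain would be used for navigable pixels and a negative gain for obstacles, matching the sign convention of the occupancy grid above.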

Make a video from processed image data

Use the moviepy library to process images and create a video.

In [27]:
# Import everything needed to edit/save/watch video clips
from moviepy.editor import VideoFileClip
from moviepy.editor import ImageSequenceClip


# Define pathname to save the output video
output = '../output/test_mapping.mp4'
data = Databucket() # Re-initialize data in case you're running this cell multiple times
clip = ImageSequenceClip(data.images, fps=60) # Note: output video will be sped up because 
                                          # recording rate in simulator is fps=25
new_clip = clip.fl_image(process_image) #NOTE: this function expects color images!!
%time new_clip.write_videofile(output, audio=False)
[MoviePy] >>>> Building video ../output/test_mapping.mp4
[MoviePy] Writing video ../output/test_mapping.mp4
 43%|████▎     | 732/1704 [00:12<00:16, 60.23it/s]
not mapping roll = 5.5569690000000005, pitch = 355.7372
not mapping roll = 6.377463, pitch = 356.764
not mapping roll = 7.305727, pitch = 358.0573
not mapping roll = 7.874055, pitch = 359.0231
not mapping roll = 8.221873, pitch = 359.6661
not mapping roll = 7.995789, pitch = 0.2681295
not mapping roll = 7.289289999999999, pitch = 0.27331940000000005
not mapping roll = 6.221814, pitch = 0.05948243
 95%|█████████▍| 1612/1704 [00:26<00:01, 64.15it/s]
not mapping roll = 6.145092, pitch = 346.7492
not mapping roll = 8.541758999999999, pitch = 345.5004
not mapping roll = 10.55658, pitch = 344.8391
not mapping roll = 13.71887, pitch = 345.537
not mapping roll = 17.15009, pitch = 347.1839
not mapping roll = 18.55047, pitch = 349.1959
not mapping roll = 19.86873, pitch = 351.2463
not mapping roll = 21.00403, pitch = 354.2666
not mapping roll = 20.94797, pitch = 356.0517
not mapping roll = 20.68005, pitch = 358.8982
not mapping roll = 19.58146, pitch = 1.432091
not mapping roll = 17.81149, pitch = 3.2675259999999997
not mapping roll = 14.64364, pitch = 4.927307
not mapping roll = 11.67892, pitch = 5.560383
 95%|█████████▌| 1626/1704 [00:26<00:01, 61.64it/s]
not mapping roll = 8.236602, pitch = 5.66116
100%|██████████| 1704/1704 [00:27<00:00, 60.95it/s]
[MoviePy] Done.
[MoviePy] >>>> Video ready: ../output/test_mapping.mp4 

CPU times: user 24.7 s, sys: 2.11 s, total: 26.8 s
Wall time: 28.4 s

This next cell should function as an inline video player.

If this fails to render the video, try running the following cell (alternative video rendering method). You can also simply have a look at the saved mp4 in your /output folder.

In [28]:
from IPython.display import HTML
HTML("""
<video width="960" height="540" controls>
  <source src="{0}">
</video>
""".format(output))
Out[28]:

Below is an alternative way to create a video in case the above cell did not work.

In [29]:
import io
import base64
with io.open(output, 'rb') as f:  # read-only binary mode is sufficient here
    video = f.read()
encoded_video = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded_video.decode('ascii')))
Out[29]: